Photometric Supernova Cosmology with BEAMS and SDSS-II 
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Supernova cosmology without spectroscopic confirmation is an exciting new frontier which we 
address here with the Bayesian Estimation Applied to Multiple Species (BEAMS) algorithm and 
the full three years of data from the Sloan Digital Sky Survey II Supernova Survey (SDSS-II SN). 
BEAMS is a Bayesian framework for using data from multiple species in statistical inference when 
one has the probability that each data point belongs to a given species, corresponding in this context 
to different types of supernovae with their probabilities derived from their multi-band lightcurves. 
We run the BEAMS algorithm on both Gaussian and more realistic SNANA simulations with of 
order 10* supernovae, testing the algorithm against various pitfalls one might expect in the new and 
somewhat uncharted territory of photometric supernova cosmology. We compare the performance 
of BEAMS to that of both mock spectroscopic surveys and photometric samples which have been 
cut using typical selection criteria. The latter typically are either biased due to contamination or 
have significantly larger contours in the cosmological parameters due to small data-sets. We then 
apply BEAMS to the 792 SDSS-II photometric supernovae with host spectroscopic redshifts. In 
this case, BEAMS reduces the area of the f2m, contours by a factor of three relative to the case 
where only spectroscopically confirmed data are used (297 supernovae). In the case of flatness, the 
constraints obtained on the matter density applying BEAMS to the photometric SDSS-II data are 
qBeams ^ 0.194 ±0.07. This illustrates the potential power of BEAMS for future large photometric 
supernova surveys such as LSST. 

PACS numbers: 
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I. INTRODUCTION 

The unexpected faintness of distant Type la Super- 
novae (SNIa) was the key to the discovery of late-time 
cosmic acceleration A decade later, the discovery 

and analysis of large numbers of high-quality SNIa data 
remain cornerstones of modern cosmology, with various 
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surveys probing SNIa over a huge range of distances, 
with a particular focus on understanding and remov- 
ing potentially unaccounted-for systematic errors and 
sharpening them as standard candles (e.g. [3H22]). 
The current state-of-the-art is a heterogenous sample 
of hundreds of SNe, predominantly at intermediate red- 
shifts, z <1 with a high-redshift, z > 1, 
sample from the Hubble Space Telescope 0, anchored 
with a low redshift sample, z < 0.02 [TH [101 HBHIS] . 
The SDSS-II SN survey data [TUl [12] fill in the 'red- 
shift gap' between 0.02 < z < 0.4. In these surveys, 
multi-band photometric light-curves are very success- 
fully used to estimate the probability that a candidate 
is a SNIa as opposed to a core-collapse supernova (Ibc 
or II) or other object, providing vital intelligence for 
the selection of likely SNIa for spectroscopic follow-up 
[5DH1T], if not currently used for making Hubble dia- 
grams. 

Future surveys such as the Dark Energy Survey (DES, 
[H]), Pan-STARRS and the Large Synoptic Survey 
Telescope (LSST, 45) will vastly increase the numbers 
of detected SNe, perhaps by a factor of a thousand in 
the case of LSST. However, the quandary facing these 
surveys is how to make appropriate use of this surfeit 
of data given that spectroscopic confirmation will only 
be possible for a small fraction of the promising SNIa 
candidates, varying somewhere between — 20%. The 
wealth of current and future data is therefore driving 
us inexorably towards an era of purely photometric su- 
pernova cosmology in which most of the cosniological 
constraints from the survey come from supernovae with 
no spectroscopic information, except perhaps for a host 
redshift obtained with a multi-object spectrograph. 
Photometric supernova cosmology is not a task to 
be undertaken lightly. While multi-band photometric 
methods strive to reduce the amount of contamination 
of the la sample from interlopers to a minimum, there 
will always be some level of contamination - typically 
around the few percent level [371 IMl HH US] - and this 
biases the inferred cosmological constraints at an unac- 
ceptable level if one simply uses standard inference 
techniques. In this standard paradigm one is faced 
with a choice between inevitable contamination using 
all data, restricting the sample size to those supernovae 
that can be followed up spectroscopically, wasting the 
available data at hand - or defining a smaller subset 
from the photometric candidates that has a high Type 
la purity (sec ^6] for one such treatment). Fortunately 
one can rigorously incorporate contamination effects 
into a Bayesian inference framework to yield unbiased 
cosmological results. 

In this paper we apply the resulting framework: 
Bayesian Estimation Applied to Multiple Species 



(BEAMS, Kunz et al. [17]) to purely photometric SN 
data with host galaxy redshift information. We test 
the algorithm against various simulations, and describe 
potential challenges in future photometric analyses. In 
addition, we apply the algorithm to the photometric 
SDSS-II SN data sample with host galaxy redshifts. 
While still in its developing stages, photometric su- 
pernova cosmology is a very promising approach to ex- 
ploit the deluge of supernova data expected in the next 
decade, and we show how BEAMS is one approach that 
is robust to general assumptions about the SN popu- 
lation. 



II. THE BEAMS FRAMEWORK FOR 
PHOTOMETRIC SN COSMOLOGY 

A. Basic formalism 

The current state-of-the-art is to restrict any contam- 
ination from non-SNIa interlopers by only doing cos- 
mology on those candidates that have been spectro- 
scopically confirmed as being Type la through the iden- 
tification of characteristic absorption lines such as the 
Si-II 6150 A feature [li \M- While this strategy is 
feasible for the current level of precision, using only 
spectroscopically confirmed supernovae in the cosmol- 
ogy analysis from future large surveys such as DES and 
LSST will result in throwing away the great majority 
of interesting candidates. 

BEAMS was developed to make the most of the up- 
coming large datasets [IT]- It is a general Bayesian 
framework that allows use of all available candidates 
provided we have some indication of how likely they are 
to be a SNIa. While BEAMS is a general method for es- 
timating parameters from any type of data which may 
be contaminated, it is readily applied to the SN prob- 
lem, where we wish to evaluate the posterior distribu- 
tion P{9\d), the probability of cosmological parameters 
which we denote hy 6 = {6i,92, ■■■,0j,9m), given the 
SN data, expressed as a vector d — (di, c?2, di, ...djv) 
of N measurements (of, for example, the distance mod- 
ulus /i or apparent magnitude m in the case of SN 
data) . 

To apply BEAMS we invoke a theoretical binary vec- 
tor T = (ti, r2, Ti, ...Tjv) of length N, equal to the 
number of data points. The entries of this vector are 
one if the corresponding data point is a Type la super- 
nova (SNIa) and zero if the point is not (for e.g. Type 
Ibc, II or non-SN), that is, = 1(0) if the i-th data 
point is (or is not) a SNIa. This represents the under- 
lying 'truth', however we shall see later that we will 
use a proxy for the true type: the 'probability' of be- 



Figure 1: 3D photometric Hubble diagram for the SDSS-II data: residuals relati ve to the input cosmology for 
the SDSS-II SN photometric data with host spectroscopic redshifts (discussed in Section IIICl. SNe are measured in 
both redshift and distance modulus space, but the BEAMS algorithm includes probability information, adding a third 
dimension to standard SN cosmology. The Type la's clearly show up on the left at high probability, but with some small 
contamination that must be accounted for. 



ing a la. In general the class of 'non-SNIa' supernovae 
can be subdivided into many subclasses; in which case 
T would not be a binary vector but would index the 
possible sub-classes. Here we consider only the simple 
binomial case. 

Applying Bayes' Theorem and marginalizing over all 
possible values of the vector r = ti, T2, r^v, the gen- 
eral posterior becomes [17]: 

P(0|d)=^P(0,r|d)^^P(d|0,r)^^^, (1) 

where P(0, r) is the prior for the parameters and P(d) 
is the usual evidence factor which does not depend on 
the cosmic parameters. 

Assuming that the data are uncorrelated we then split 
the effective posterior into two parts for each i-th data 
point: 

p{d\e,T)P{T% = 

[P{d,\e, n = i)-P,: + P{d.\e, n = 0)(l - Pi)] (^) 

where Pi = P{Ti = 1), is the probability that a given 
point is in fact a SNIa, P{di\6, n = 1) is the likelihood 



of the la distribution, and P{di\6,Ti = 0) is the non- 
SNIa likelihood. This assumption of uncorrelated data 
is crucial in separating the contributions of the la and 
non-la populations to the final posterior, and relaxing 
this assumption will require a more complex statistical 
description. 

The probabilities Pi are determined through fitting 
light curves to standard SNIa models such as SALT2 
[50] or MLCS2k2 which typically assume that the 
data belong to the SNIa class and hence fit SNIa light- 
curve templates to all the data points. The Type la 
probability can either be obtained using a goodness- 
of-fit of the light-curves normalized by the degrees of 
freedom (dof): 

P, = Pfit(xexp(xL/dof), (3) 

or by fitting multiple templates to the data and obtain- 
ing probabilities from the relative of the fits from 
different SN templates (la, Ibc, II, etc.), Ptypor, such 
as the PSNID typer [SllEHl- 

The probability ([3]) represents how well a light-curve 
fits the photometric magnitudes, and does not tell you 
a priori how likely the data point is to be a la. For 
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example if both la and II templates fit the data equally 
well with a xk^/dof = 0.95, then the relative probabil- 
ity of each type is Pi = 0.5, while the probability of be- 
ing a la obtained only from the fit of the la light-curve 
to the data is Pjgt = 0.65. Typically the probabilities 
are combined with additional selection cuts [351 
Hence in general the conversion from a normalized X\c 
to Pi can lead to skewed probabilities (see Newling 
et al. [52] for a detailed investigation of potential bias 
from incorrect assumptions about the probabilities - 
we test for distance modulus-probability correlations 
in Appendix [A|), as selection cuts, which are typically 
based on supernova rate or intrinsic brightness can in- 
troduce redshift-dependent selection bias. In addition, 
a data set may contain very different numbers of Type 
la and core-collapse supernovae. For this reason we 
add a global parameter A that re-normalizes the rela- 
tive probability (the Bayes factor) of being of Type la 

or not through Pla,i/(1 - Pla,*) -J> APia,i/(l - Pu^^). 

The final probability that enters the BEAMS likelihood 
is then 



r,(A) 

la, 2 



Pi. 



AP 



where we estimate A simultaneously with the other pa- 
rameters (subject to a Jeffreys' prior, i.e. we sample 
uniformly in In^, although the results are not depen- 
dent on the prior used). This mapping provides an in- 
dication of whether or not the input probabilities (from 
the light curve fitter for example) are biased, as we ex- 
pect the distribution to be peaked around one. The 
re-mapping of probabilities allows BEAMS to 'correct' 
for bias in the input probabilities. This parameter A 
is necessary to debias the probabilities with respect to 
the overall la/non-Ia ratio of the whole sample even 
if the light-curve fitter gives a perfect 'per SN' prob- 
ability. We discuss this in detail in Section |IVD| In 
general one might allow A to vary with redshift, or 
indeed with the light-curve model used, given the vari- 
ety of assumptions made by the light-curve fitters. We 
leave these tests to future work and in this analysis we 
consider only one global parameter A. 
In addition, it is important to note that in applying the 
BEAMS algorithm we do not assume a known popula- 
tion of la (or non-la) SNe, and hence no probabilities 
are set to zero or unity in the analysis, even if they 
have been spectroscopically confirmed, as the number 
of known las will be much smaller than the total pho- 
tometric sample in future surveys. 



The total BEAMS posterior is then 
Pi9\d) oc P(6>)x 

N 

n {p{d^\9, n = i)<i + P{d.\9, n = 0) (l - pI^}) } 

(5) 



where the parameter vector 6 now contains the cosmo- 
logical parameters {7?o, ^^m, ^^a}, the probability pa- 
rameter A and the extra parameters necessary to model 
the supernova likelihoods as discussed below. P{d) 
represents the prior probabilities of the parameters. If 
we are interested only in the cosmological parameters 
then we marginalize over all the others. We will now 
discuss in detail our choice of the Type la and non- 
SNIa likelihoods. 



B. The likelihood distribution for SNIa 

The la likelihood is modeled as a Gaussian probabil- 
ity distribution function (pdf) for the observed dis- 
tance modulus /ii centered around the theoretical value 



(4) fj,{z, 9) with a variance cr^ 



P{ii.,\9,n = 1) 



27r(T, 



1 / {ix,-ii{z,,9)f 

exp - 



tot,i 



2at 



(6) 

The distance modulus is related to the cosmological 
model via: 



/i(z,0) = b\ogdL{z,9) + 2b, 



(7) 



where 



dL (z, 9) = 4=77^ X sinh ( y/Vlk 




(8) 



is the luminosity distance measured in Mpc, the ex- 
pansion rate is given by 



E{z) = Va„(i + z)3 + nk{i + z)2 + n^Efiz). (9) 

The energy densities relative to flatness of matter 
(flm), curvature (flk) and dark energy (JIde) obey the 
relation + + J^de = 1- The distance modulus 
is defined as the difference between the absolute and 
apparent magnitudes of the supernova, fi — m — Al, 
with additional corrections made to the apparent mag- 
nitude for the correlations between brightness, color 
and stretch and a K-correction term related to the dif- 
ference between the observer and rest-frame filters, for 



5 



example. The corrections are typically made within the 
model employed in a light-curve fitter, such as that for 
MLCS2k2. 

The parameter Hq is the Hubble constant, and the 
function f{z) = /Cde(-2)/pde(0) describes the evolu- 
tion of the dark energy density. While one of the ul- 
timate goals in SN cosmology is to test for dynamics 
with redshift of the equation of state w, this relies on 
a sample at both high and low redshift to anchor the 
Hubble diagram and provide a long lever arm. In this 
work, we discuss how BEAMS improves constraints on 
parameters when including a photometric sample, and 
hence do not include the low or high redshift samples 
in this case. For this reason we focus on the flrm^A 
combination of cosmological parameters, and so will 
only consider ACDM models for which /(z) = 1. In 
principle we should also consider radiation, but its en- 
ergy density is negligible at late times when we observe 
supernovae. 

In this application of BEAMS we have assumed that 
the distance modulus fj, is obtained directly from the 
light-curve fitter (such as is the case for fitters which 
use the MLCS2k2 light-curve model), however this is 
not an implicit assumption. In the case of the SALT 
light-curve fitter, the distance modulus would be re- 
constructed using a framework such as that outlined 
in [53] before including in the BEAMS algorithm. 
We model the error on the distance modulus of each su- 
pernova as a sum in quadrature of several independent 
contributions, 



where (T^t^i is the error obtained from fits to the SN 
light-curve, (Tt- is the characteristic intrinsic dispersion 
of the supernova population, which we add as an addi- 
tional global parameter to the vector 9 with Jeffreys' 
prior. The constraints do not depend strongly on the 
prior used for the intrinsic dispersion. The error term 
<^fi.z converts the uncertainty in redshift due to mea- 
surement errors and peculiar velocities into an error in 
the distance of the supernova as: 



l + z 



ln(10)z(l + z/2) 



\^? + ^W^, (11) 



with az as redshift error, and Vpec as the typical am- 
plitude of the peculiar velocity of the supernova, which 
we take as 300 kms'^ pUlITT]. 

In general, light-curve models such as SALT2 [SU] or 
MLCS2k2 [51 are used to fit fluxes in various bands 
and time epochs to obtain a distance modulus. The two 
light-curve models are based on different approaches 
and hence make different assumptions about the under- 
lying SN properties. In general one might also include 



a systematic error due to differences in distance modu- 
lus from using different light-curve fitters as discussed 
in Kessler et al. [TU] . However, given that we are fitting 
the light-curves using only the MLCS2k2 model in this 
analysis, and as we are interested in the relative im- 
provement of constraints when applying BEAMS, we 
ignore this constant systematic error without loss of 
generality. 



C. Forms of the non-SNIa likelihood 

The general form of the non-SNIa likelihood will be 
complicated since there are several sub-populations. 
Given the limited number of non-SNIa in the SDSS- 
II SN data set however, (see Figure [5]) we will model 
it with a single mean and a dispersion. If one chooses 
to describe a population using only a mean and a vari- 
ance, statistically the least-informative (maximum en- 
tropy) choice of pdf in this case is also a Gaussian [SI] , 



P{fi,\e,n = 0) = 



1 



2nst 



■ exp 



2s2 



(12) 

As we do not know the mean 77 and variance sl^^. j of 
the non-SNIa population, we describe them with ad- 
ditional parameters. We will keep the parametrization 
of the mean very general (see below) but for the vari- 
ance we restrict ourselves to the same form as for the 
Type la supernovae, Eq. (10 1, but with a potentially 



(10) 

different intrinsic dispersion described by an inde- 



pendent parameter (again with a Jeffreys' prior). We 
assume that the measurement errors and the contribu- 
tion from the peculiar velocities enter in the same way 
for Type la and other supernovae and so keep these 
terms identical. 

We do not know what to expect for the mean of the 
non-SNIa pdf and so we allow for a range of possi- 
bilities. As the brightness is linked to the luminos- 
ity distance through Eq. ([7|, we describe the expected 
non-SNIa distance modulus (as provided by the light- 
curve fitter) as a deviation from the theoretical value, 
77(z, 6) — fi{z, 6) + T(z), where we consider the follow- 
ing Taylor expansions of the difference as a function of 
redshift: 



T(z) =?7(z,6») -/i(z,0) cx: ^(a,z')/(l + dz) 



(13) 



We consider the cases where we set different combina- 
tions of the parameters (ai,d), to zero, and employ a 
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criterion based on model probability to decide which 
of these functions to use. We note that the explicit 
link of 77(2, 6) to fi{z, 9) carries a risk that the non- 
SNIa likelihood can influence the posterior estimation 
of the cosmological parameters. For this reason we ver- 
ify that the contours do not shift when we set directly 
ri{z,6) = T(z), although we will need a higher-order 
expansion in general (and of course the recovered pa- 
rameters of the function T(z) will change). In general, 
as long as the basis assumed has enough freedom to fit 
the deviation in distance modulus of the non-SNIa pop- 
ulation from the la model, the inferred cosmology will 
not be biased. 

For a cosmological analysis we just marginalize over 
the values of the parameters in T{z), but these param- 
eters contain information on the distribution of non-la 
type SN and thus their posterior is of interest as well, 
allowing us to gain insight into the distribution charac- 
teristics on the non-SNIa population at no additional 
'cost'. 

The simple binomial case considered here, where the 
non-SNIa population consists of all types of core- 
collapse SNe, is probably too simplistic to accurately 
describe the distribution of non-SNIa supernovae. In 
general one could include multiple populations, one for 
each supernova type, which would yield a sum of Gaus- 
sian terms in the full posterior. In addition, the forms 
describing the distance modulus of the non-SNIa pop- 
ulation are chosen to minimize the cosmological infor- 
mation from the non-SNela (we always test for a devia- 
tion from the cosmological distance modulus), however, 
the parameterization of the non-SNIa distance mod- 
ulus could be improved by investigating the distance 
modulus residuals from simulations, as the major con- 
tributions to the distance modulus residuals appear to 
be the core-collapse luminosity functions, along with 
the specific survey selection criteria and limiting mag- 
nitude, see Falck et al. |55|. While current SN samples 
do not include a large sample of non-SNIa data to test 
for this, larger data sets (such as the data from the 
BOSS SN survey) will allow for a detailed analysis of 
the number (and form) of distributions describing the 
contaminant population. 



D. Markov Chain Monte Carlo Methods 

In this work, the BEAMS algorithm is implemented 
within a Markov Chain Monte Carlo (MCMC) frame- 
work, and the Metropolis-Hastings ^56| acceptance cri- 
terion was used. We use the cosmological parameters 



in the case of the approach on the spectra and cut 
samples described below, and add additional parame- 
ters 



(15) 



(14) 



in the case of the BEAMS application. The parame- 
ters a — {a", a^, a^}; d = a? = Q in Eq. (15) are for 
the quadratic model, in the other models for T{z) we 
adjust the parameters accordingly. The chains were 
in general run for around 100 000 steps per model; 
this was sufficient to ensure convergence. We test for 
convergence using the techniques described in Dunk- 
ley et al. j57^. We impose positivity priors on the 
energy densities of matter and dark energy, and im- 
pose a fiat prior on the Hubble parameter between 
20 < iJo < 100 kms-^Mpc-^ The Hubble param- 
eter is marginalized over given that we do not know 
the intrinsic brightness of the supernovae, but through 
the distance modulus are only sensitive to the relative 
brightness of the supernovae. We impose broad Gaus- 
sian priors on the parameters of the non-SNIa likeli- 
hood function, and step logarithmically in the proba- 
bility normalization parameter A, the intrinsic disper- 
sion parameters of both the la and non-SNIa distribu- 
tions. 



E. Comparison to standard methods 

The primary difference between BEAMS and current 
methods is that the latter either require that all data 
are spectroscopically confirmed, or apply a range of 
quality cuts based on selection criteria. In this paper 
we will compare the performance of BEAMS to these 
two approaches, by processing the data that pass 
the required selection criteria using the la likelihood, 
Eq. (|6). We will hereafter refer to this as the 
approach. 

We use the following samples: 

• spectro sample: 

The sample containing only spectroscopically 
confirmed supernovae. In addition to spectro- 
scopic confirmation we will also apply a cut on 
the goodness-of-fit probability from the light- 
curve templates within the MLCS2k2 model, 
Pfit > 0.01, and a cut on the light-curve fit- 
ter parameter A > —0.4, where A is a param- 
eter in the MLCS2k2 model describing the light- 
curve width-luminosity correlation. MLCS2k2 
was trained using the range —0.4 < A < 1.7 i51j , 
hence we restrict the sample to A > —0.4, which 
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is a cut typical in current SN surveys, and so we 
introduce the cut to provide comparison between 
datasets. We process this spectra sample using 
the approach. 

• cut sample: 

This larger sample is selected both by remov- 
ing 5a outliers from a moving average fit to the 
Hubble diagram including both photometric and 
spectroscopically confirmed data and applying a 
cut to the sample, including only data with a 
high enough probability, Ptypcr > 0.9 (where the 
probability comes from a general supernova typ- 
ing procedure, such as PSNID, described in Sako 
et al. [371 I3H])- We choose to use the PSNID 
probabilities to make the probability cut on the 
sample (Ptypor > 0.9); if the MLCS2k2 proba- 
bilities had themselves been used to make a cut 
sample, then objects would only be included if 
they had probabilities greater than, for exam- 
ple, Pfit > 0.9 In addition, we impose a cut on 
the goodness-of-fit of the light-curve data to the 
Type la typer, xfc < l-^i a cut on the goodness- 
of-fit probability from the light-curve templates 
within the MLCS2k2 model, Pfit > 0.01, and a 
cut on A > —0.4. In this cut sample case we then 
use standard the cosmological fitting proce- 
dure on the sample, and so set the la probability 
of all points to one. 

• photo sample: 

This sample is the one to which BEAMS will be 
applied, and will include all the photometric data 
with host galaxy redshifts. As in the previous 
two cases, we include only data which have Pfit > 
0.01, A > -0.4. 

Note that the spectra sample will be included in all the 
three samples described above. While the spectra and 
cut samples have by definition Pia = 1 (as they are an- 
alyzed in the approach) , we do not set the probabil- 
ities to unity when applying BEAMS to the full sample 
- the spectra subsample within the larger phata sam- 
ple will be treated 'bhndly' by BEAMS. The spectra 
sample is the one most similar to current cosmologi- 
cal samples, and will be used to check for consistency 
in the derived parameters between BEAMS applied to 
the photo sample and the approach on the spectra 
sample. 



III. DATASETS 

We apply BEAMS to three datasets. Firstly we gener- 
ate an ideal simulated dataset where the input la and 



non-SNIa model for distance modulus are known, and 
all data are simulated as Gaussian distributions around 
this model. The second level of simulations are gener- 
ated from SNANA [58] as light-curves and then fit us- 
ing MLCS2k2 [55 , based on an SDSS-II-like dataset. 
The third level is the SDSS-II SN Survey photomet- 
ric data with host-z from 2005 to 2008. The various 
datasets are described below. 



A. Level I: Gaussian simulations 

To test the BEAMS algorithm explicitly we need a 
completely controlled sample, where all variables (such 
as the non-SNIa model, SNIa probabilities) are directly 
known and where we can verify that the algorithm is 
able to recover them correctly. In addition, we use this 
data set to check that we recover the correct shape 
of the non-la distance modulus 77(2) since the true 
77(2, 0) is known for this sample only. We simulate 
a population of 50 000 SNe, with redshifts drawn from 
a Gaussian distribution, z ^ A/'(0.3, 0.15), and dis- 
tance moduli drawn from a flat ACDM universe with 
{n^,nA,Ho,wo,Wa) = (0.3,0.7,70,-1,0). The non- 
SNIa population includes a contribution to the distance 
modulus, 77(2:, 6) = /i(z, 0) + a'^ + a^z + o??^ ^ where we 
choose (a°,a^,a^) — (1.5, 1,-3). We assign P/a proba- 
bilities from a linear model dN/dPia = Aq-\- Ai* Pja- 
We then assign the types from the two samples (of las 
and non-SNela ), i.e. we choose a random number t 
and if t > Pja (i.e. the type also follows the same lin- 
ear relationship as the probability) we take the data 
point to be a la, and if i < Pja we assign it as a 
non-SNIa, until we run out of data points from either 
sample. This procedure reduces the sample size from 
50 000 to 37529. 

We assign a 'measurement error' to each distance mod- 



ulus of cr,. 



0.1; add an intrinsic error cr. 



0.16 



and a peculiar velocity error based on Eq. (10 1, with 
Vpec — 300kms~^. We then randomly scatter to the 
data points based on the total errorbar. To mimic 
what happens in a light-curve fitter, only the measure- 
ment error is recorded, however. When performing pa- 
rameter estimation on the points we either add this 
measurement error in quadrature to the other terms 
whose amplitudes are fixed (in the case of the aP" 
proach), or we estimate the magnitudes of the intrinsic 
dispersion when we apply the BEAMS algorithm. We 
randomly choose 10% of the la data and assign spectra 
status; this represents the data that are followed up by 
large telescopes on the ground. This spectra sample is 
drawn so that we can compare the BEAMS-estimated 
result to the approach on a smaller sample. The 
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Figure 2: Level I Gaussian data: 37529 points simu- 
lated according to a Gaussian distributions around a dis- 
tance modulus in a flat ACDM model for the la population 
(25000 points) and with extra terms up to quadratic order 
in redshift for the non-la population. The points are col- 
ored according to their simulated probabilities from blue 
(low probability) to dark brown (high probability). 
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Figure 3: Level II SNANA simulations: 35815 SNANA 
simulated data points generated from a ACDM concordance 
cosmology and fitted with efficiency corrections as discussed 
in the text, which satisfy the conditions P^t > 0.01; A > 
—0.4. The data contain 30623 SNla, and we define a smaller 
subset (1467 SNe la) as a spectra subsample, to mimic cur- 
rent data. As in Figure [2] the points are colored by proba- 
bility from low (blue points) to high (dark brown points). 



data are shown in Figure [2j In the BEAMS analysis 
we checked on a small number of simulated samples 
that the results obtained were unbiased - a full Monte 
Carlo simulation of bias is beyond the scope of this 
work. 



B. Level II: SNANA simulations 

The previous Gaussian simulation is generated in order 
to test the algorithm for any intrinsic biases in the anal- 
ysis procedure. In order to apply BEAMS to a more 
realistic scenario, we use the Supernova Analysis pack- 
age SNANA [SR| to simulate a mixed sample of Type la 
supernovae and non-SNIa contaminants and to include 
realistic survey characteristics. SNANA contains both 
a simulation module to generate light-curve data, and a 
light-curve fitter that includes the MLCS2k2 model we 
used in this work (both SALT2 and MLCS2k2 are con- 
tained within the SNANA package). This sample pro- 
vided a useful procedure to test the BEAMS algorithm, 
where the final distribution of distance moduli are not 
explicitly given, but rather arise from the generation 
of SN data from light-curve templates, and the fitting 
of those templates with standard light-curve fitters. 
The simulation specifications were chosen based on the 



SDSS-II SN survey characteristics. A sample of 62441 
SNe were simulated between redshifts 0.02 < z < 0.5, 
assuming a ACDM {Q^j^a) = (0.3,0.7) cosmology. 
The simulation was generated using the same char- 
acteristics as in the Supernova Photometric Classifi- 
cation Challenge (SNPCC, see [SS ^ for the simu- 
lation specifcations), where the non-la simulation is 
based on 42 spec-confirmed non-la light curves. The 
cosmology cuts of Pfn > 0.01, A > —0.4 reduce the 
sample from 62441 to 35815. Most of the SNe that 
are cut from the sample are from the non-la sample; 
only 17% of the final sample are non-SNIa. A large 
spectra sample of 13826 SNe were flagged as spectro- 
scopic in MLCS2k2, however we reduce this sample to 
a smaller spectra sample of 1467 SNe (roughly ~ 10% 
of the full spectroscopic Type la sample). The simula- 
tion was generated using an efficiency correction based 
on the full sample of Type la SN, including photomet- 
ric and spectroscopic candidates. Hence the smaller 
spectra sample was taken from the full set of all las in 
the sample (i.e. it is not a subset of those flagged as 
spectroscopically conflrmed) so as not to introduce an 
efficiency bias. 

The data are corrected for a small redshift-dependent 
bias in the fitted distance modulus. This correction is 
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Figure 4: Distance moduli of the Level II SNANA 
simulation: the top panel shows the 35815 SNANA 
normed histogram of the distance moduli residuals (the 
difference of the distance modulus of each point relative 
to the input ACDM cosmology) for the la SNe only, with 
the Gaussian fit to the residuals. The skewness and kurto- 
sis of this distribution are 0.46 and 0.11 respectively. The 
bottom panel shows the normed histogram for the Type 2 
(SN Types Iln, IIP, IIL) and Type 3 (SN Types lb, Ic) pop- 
ulations (the SNANA simulation yields populations gener- 
ated according to many subtypes, as specified in [591 160j: 
here we consider two broader classes for visual purposes). 
The dashed line shows the Gaussian distribution fitted to 
the mean and standard deviation of all the non-la types 
combined. In this case, the simple sum of two Gaussian 
distributions, one for the las and one for the non-la points 
is no longer adequate in describing the simulated model. 
In particular (as can also be seen from Figure [2]|, the sim- 
ulation predicts a population of non-SNela brighter than 
the la population (negative A/i), which is not seen in the 
SDSS-II SN sample, as stringent cuts are typically made to 
the data before obtaining a Hubble diagram. 



determined by comparing the MLCS2k2 fitted modu- 
lus to the input distance modulus in ten redshift bins, 
and varies with by up to 2%. The data are shown in 
Figure [3| The distance moduli residuals of the SNANA 
simulation are binned in Figure |4| illustrating that the 
non-SNIa population is in general not merely a sin- 
gle Gaussian family. In this case the single Gaussian 
assumption in BEAMS will not be entirely accurate, 
however this sub-structure is not yet observed in cur- 
rent data, which has not included large populations of 
non-la supernovae, and hence we leave the multino- 
mial description for future work (see Appendix [b] for a 
discussion of pitfalls in photometric cosmology and fu- 
ture outlook) . In addition, as can be seen from the top 
panel of Figure [4] the la distance moduli are approxi- 
mately Gaussian. One can check whether allowing for 
non-zero skewness and kurtosis through, for example, 
a saddlepoint distribution improves the fit of the data 
- particularly in the tail regions. 



C. Level III: SDSS-II SN photometric data 

The Sloan Digital Sky Survey Supernova Search oper- 
ated for three, three-month long seasons during 2005 
to 2007. We use the photometric supernovae from all 
three seasons of the SDSS-II SN survey which also had 
host galaxy redshifts from the SDSS survey. The analy- 
sis and cosmological interpretation of the first season of 
data (hereafter Fall 2005) are described in Kessler et al. 
[10^, Lampeitl et al. [11], Frieman et al. |61] and [62]. 
The SDSS CCD camera is located on a 2.5 m telescope 
at the Apache Point Observatory in New Mexico. The 
camera operated in the five Sloan optical bands ugriz 
[GJ . The telescope made repeated drift scans of Stripe 
82, a roughly 300 square degree region centered on the 
celestial equator in the Southern Galactic hemisphere, 
with a cadence of roughly four to five days, accounting 
for problems with weather and instrumentation. 
The images were scanned and objects were flagged 
as candidate supernovae [37] • Candidate light-curves 
were compared to a set of supernova light-curve tem- 
plates in the g, r, i bands (consisting of both core- 
collapse and Type la supernovae) as a function of red- 
shift, intrinsic luminosity and extinction. Likely SNIa 
candidates were preferentially followed up with spec- 
troscopic observations of both the candidates and their 
host galaxies (where possible) on various larger tele- 
scopes (see Sako et al. [ST]). 

In addition to the spectroscopically confirmed SNela 
discovered in the SDSS-II SN, many high-quality can- 
didates without spectroscopic confirmation (i.e. only 
photometric observations were made of the SNe) but 
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sample consists of the data points which have a typer 
probabiUty of Ptypor > 0.9 and a goodness-of-fit to the 
hght-curve templates within the PSNID typer [S71I55] . 
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Figure 5; Level III SDSS-II SN data: the photomet- 
ric sample of the full three seasons of SDSS-II SN survey. 
The 792 points are all those with host galaxy spectroscopic 
redshifts. The sample includes 297 spectroscopically con- 
firmed SNe, and are color coded using probabilities from 
the PSNID typer Sako et al. [371 EH] from low (blue) to 
high (dark brown). 



which, by chance, have a host galaxy spectroscopic red- 
shift, are present in the SDSS sample^. 
We include these SNe in both the cut sample and the 
spectra samples in the full photo sample, but do not set 
the probabilities of these points to unity. These super- 
novae are fit with the MLCS2k2 model [SI, to obtain 
a distance modulus for each supernova, assuming the 
supernova is a Type la, in the same way as the Level 

II SNANA simulations. 

As outlined in Section IIIE 



we impose the standard 
selection cuts on the probability of the fit to the 
MLCS2k2 light-curve template > 0.01 and A > 
—0.4 to all data, and require that the data used have 
spectroscopic host galaxy redshift information. Ap- 
plying these cuts to the full three year data yields a 
photometric sample of 792 SNe, with a spectroscopic 
subsample of 297 SNe. The spectra sample consists 
of the objects which have been spectroscopically con- 
firmed by other ground-based telescopes, while the cut 



^ The BOSS survey recently obtained host galaxy redshifts of all 
high-quality SN candidates from all three seasons of the SDSS- 
II Supernova Search. This work does not use the additional 
BOSS information and only uses the host galaxy redshifts ob- 
tained during the running of the SDSS-II survey. 



< 1. 



IV. APPLICATION OF BEAMS 



A. Performance of BEAMS comparisons across 
datasets 



The BEAMS approach can be compared to standard 
techniques, namely the approach applied to sub- 
sets of the dataset resulting from cuts. For the Level I 
Gaussian simulation we define the spectra sample as a 
randomly selected sample of 10% of the points we know 
to be of Type la. This is to match expected future effi- 
ciencies of spectroscopic confirmation; one will always 
be comparing the performance of BEAMS, which takes 
account of contamination within the algorithm, on a 
larger photometric sample against a approach that 
does not directly control for contamination, on smaller, 
but more pure sample. We compare the constraints us- 
ing the three level datasets and various approaches in 
Figure [6j 

In each case, the BEAMS algorithm applied to the data 
gives the tightest constraints that are also consistent 
with the input cosmology (in the case of simulations) 
and the spectroscopic sample (in the case of the data) . 
In the case of the Level I Gaussian simulation (since 
there is no light-curve fitting procedure in this simu- 
lation) the cut sample is taken to be all points with 
probability Pia > 0.9, while for the Level II SNANA 
simulation this is taken as all points that satisfy the 
basic cuts (such as the cut on the A parameter), and 
which also satisfy Pja > 0.9 and the goodness-of-fit 
xfc < Note that the cut on the goodness-of-fit 
is not particularly conservative. In general, the more 
conservative the cut, the less biased the contours be- 
come. However, this is at the cost of the size of the 
contours, which increase, thereby losing the statistical 
power of the large sample. The curve corresponding to 
the spectroscopic subset we define as 'unbiased' since 
they by definition are the contours that would result 
in a contemporary analysis. 

A small (~ Icr) bias is visible in the recovered cosmol- 
ogy in the case of the Level II SNANA simulations. 
This is potentially due to a combination of factors. 
Firstly, a single Gaussian is used to model the dis- 
tance modulus of the non-SNIa population, which we 
can see from Figure |4] is not an accurate description of 
the core-collapse population within the SNANA simu- 
lation. This same substructure is not seen in the data. 






Redshift z 

Figure 6: Comparing analysis techniques on various datasets: The left panel shows the 2a contours in the Q.m,^A 
plane for Levels I-III (top to bottom), while the right panel shows the A/i(2:) for the sample, where the data are colored 
by probability from low (blue) to high (dark brown). In addition, the points which are 'spectroscopic' are colored in black. 
The levels are characterised in Table|l] In each case the BEAMS constraints are consistent with the concordance cosmology 
shown as the filled orange square (which is what was input for the simulated data, and which one might hope to recover 
in the real- world data). The best-fit BEAMS point is given by the black square, while the best-fit cosmology from the 
spectroscopic data is indicated by the brown cross. While the cut approach based on probability of fit (and the parameter 
A in the case of the Level II simulations and Level III data) of light curve templates recovers the sample cosmology as the 
spectra sample for stringent enough cuts, these cuts reduce the sample size significantly. The top left hand panel shows 
how even a relatively stringent cut on probability of Pcut = 0.9 biases the inferred cosmology; stronger cuts will recover 
the true cosmology at the cost of sample size. 
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Dataset 


Level I 


Level II 


Level III 




Gaussian 


SNANA 


SDSS-II SN 




sim 


sim 


data 


Redshift range 


(0.02, 0.9) 


(0.02, 0.45) 


(0.02, 0.45) 


Total photo sample Size 


37529 


35815 


792 


No of spectro points 


2500 


1467 


297 


No of cut points 


7130 


10967 


191 


No of la points 


25000 


30623 


unknown 



TABLE I: Summary of datasets - the redshift distribu- 
tion and sample sizes of the datasets compared in Figure |6] 
The Level I Gaussian simulation and constraints are shown 
in the top row of Figure [6] the Level II SNANA simulation 
is shown in the middle panel, and the Level III SDSS-II SN 
data are shown in the bottom row of Figure [6] in which case 
the true numbers of la SNe in the sample are unknown. 



and hence we motivate that a single population is suf- 
ficient to model the contaminant population. 
This assumption will be relaxed when applying 
BEAMS to the larger SN sample within the BOSS 
data, which is left to future work. In addition, an effi- 
ciency correction for Malmquist bias was made within 
the MLCS2k2 fitting procedure, based on the sample 
of la data. We do not expect the Malmquist bias of 
the la supernova sample to be the same as that of the 
non-la data; this issue will be addressed in detail in 
future work, and is essential for future large photomet- 
ric surveys. An additional source of bias could be due 
to incorrectly assuming a Gaussian likelihood for for 
the la (and non-la) populations, as this would bias all 
cosmological analyses. We leave investigation of non- 
Gaussian likelihoods to future work. 
As is shown in Figure |6] BEAMS recovers the input 
cosmology of the simulations and estimates parame- 
ters consistent with the spectro sample in the case of 
the Level III SDSS-II SN data. Moreover, the BEAMS 
contours are three times smaller than when using the 
spectro sample alone. In the Level 11 SNANA simu- 
lation the contours are ~ 40% the size of the spectro 
sample, while in the case of ideal Level 1 Gaussian sim- 
ulations, the BEAMS contours using all the points are 
~ 16% of the size of the spectro sample. This high- 
lights the potential of photometric supernova cosmol- 
ogy to drastically reduce the size of error contours with 
larger samples while remaining unbiased relative to the 
'known' spectroscopic case. 



B. Scaling of errorbars 

As discussed in Kunz et al. [47, for the one-dimensional 
case, the effective number of SNe that result when ap- 
plying BEAMS scales as the number of spectroscopic 
SNe and the average probability of the dataset mul- 
tiplied by the remainder of the photometric sample, 
a — >■ cr/ i/iVspec + (^la) A^photo- In the two-dimensional 
case, the square root would be removed as the area of 
the ellipse scales with the increase in the effective num- 
ber of supernovae. In our applications we have, how- 
ever, not used the fact that we know that some points 
are confirmed as Type la. In other words, the proba- 
bility of each data point was taken from the light-curve 
fitter and was not adjusted to one or zero depending 
on the known type. Hence we expect the size of the 
contours in the i — j plane to scale as 



a 



1/2 



a 



1/2 



(^la)^, 



photo 



(16) 



We compute the size of the error ellipse for various 
Level I simulations as a function of the size of the sim- 
ulation, shown in Figure [7j for one particular model 
of the probabilities, and hence one value of {P). We 
impose a prior on the densities, and hence the ellipses 
are not closed for smaller samples. For large enough 
sample sizes the ellipse is closed and we observe that 
the error ellipses scale in area as (x l/{Pia)N, which is 
consistent with earlier results [17] . In general then, one 
would obtain a different constant factor (P) in Figure [t] 
for different simulated probability distributions. Large 
supernova surveys will not only increase the total num- 
ber of Type la SNe candidates, but will allow one to 
investigate systematics about the SNe populations di- 
rectly. The BEAMS algorithm is designed to include 
and adapt to information about the non-SNla popula- 
tion easily. By adapting the form of the non-SNla pop- 
ulation, and including more than one population group, 
one could use BEAMS to gain insight into the contam- 
inant distribution. 



Constraining T{z) forms for the 
non-SNla population 

1. Level I Gaussian simulation 



The Gaussian simulation described in Section III CI uses 
a quadratic model for the differences between the stan- 
dard ACDM fi{z) and the non-SNla distance modulus. 
We test here that assuming a different functional form 
while performing parameter estimation does not sig- 



1.1 




Number of SNe 

Figure 7: Errors scale with number of SNe: tiie size 
of tiie error ellipse, approximated by the square root of the 
determinant of the two-dimensional chain of flm, shows 
the reduction in size with increasing the number of SNe in 
the simulation. 



nificantly bias the inferred cosmology. We define the 
effective as — 2 In £, where the posterior £ is a linear 
sum of the terms in Equation ([5| , and provide values 
relative to the simplest linear model for T(z). In Fig- 
ure[8]we show that BEAMS is reasonably insensitive to 
the assumed form of the non-SNIa likelihood, provided 
it is allowed enough freedom to capture the underlying 
model. A linear model fails to recover the correct cos- 
mology, as it does not have enough freedom to recover 
the difference between the la and non-SNIa distribu- 
tion. It correspondingly has a very high relative to 
the other approaches. The higher-order functions re- 
cover consistent cosmologies, and the of these mod- 
els improves by Ax^ < 0.5, even though the models 
have increased the number of parameters by one. 



2. Level II: SNANA simulations 

In the case of the Level I Gaussian simulated data, the 
explicit form of the T(z) was specified as quadratic and 
then various other forms for T(z) were fit for within the 
BEAMS approach. In general, any model with enough 
freedom (of equal or higher order to the input model) 
managed to recover the input cosmology. We apply this 
test to the Level II SNANA simulations, where the data 
are generated from fitting generated light-curve data 
to templates. Naively one might not expect a simple 
model to fit the data. In general we find that while a 
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Figure 8: Different T(2;) distributions for the non-la 
likelihoods: 2(7 constraints in the VLa plane for differ- 
ent versions of the non-la distance modulus function, for 
the Level I Gaussian simulation (top panel), the Level 11 
SNANA simulation (middle panel) and the Level 111 SDSS- 
11 SN data (bottom panel). In the case of the Level 1 sim- 
ulation, we simulated a quadratic model, and ran BEAMS 
assuming a linear, quadratic, cubic and Fade form for T(z), 
as described in Section 111 CI For the other two cases the 
underlying distribution for the non-SNIa distance modu- 
lus is not known analytically; we test for the same models 
as for the Level I simulation; the legend is the same for 
all panels. In the Level I Gaussian simulation the linear 
model does not have enough freedom to capture the non- 
SNIa distribution (as expected, since the input model was a 
quadratic function) . This behavior is also seen in the Level 
11 SNANA simulations. The Level III SDSS data does not 
have a particular preference of the form used, as the num- 
ber of SNe in the sample is not large enough to constrain 
the non-SNIa population. The goodness-of-fit of the distri- 
butions to the data are summarized in Tables [Ilj |I11| and 
|lV|for the Level 1, 11 and III cases respectively. 
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Model 




Parameters 


T(z) = az + c 





2 


T{z) = az + bz"^ + c 


-192.9 


3 


T{z) ^ az + bz^ +cz'* + d 


-193.3 


4 


T(z) = {az + bz^ + c)/(l + dz) 


-193.4 


4 



''Difference in the effective between a given model and the 
linear case, which has x^g = 42526.2. 

TABLE II: Comparison of non-SNIa likelihood mod- 
els for Level I Gaussian simulation: values for the 
fits using various forms of the non-SNIa likelihood for the 
Level I simulations, where the true underlying model is a 
quadratic. The constraints on f2m,,f2A are shown in Fig- 
ured 



cubic model performed the best at fitting the data (it 
had the lowest relative to the linear model), the 
inferred cosmology in the quadratic case is consistent 
with the input cosmology. Fitting the Level 11 SNANA 
simulated data with a linear model led to large a bias 
in the inferred cosmology, as shown in Figure |8] 



Model 


Axcff" 


Parameters 


y{z) = az + c 





2 


T{zf = az + bz^ + c 


-135.1 


3 


T{z) = az + bz^ -1- cz^ -f d 


-287.2 


4 


T(z) = {az + bz"^ + c)/(l + dz) 


-206.1 


4 



"Difference in the effective between a given model and the 

hnear case, which had x^fj = 10092.6 . 

''In the quadratic case, the best-fit values for the 

non-SNIa distribution parameters were 

{a?,a^,a^,tjr,sr) = (1.99,-13.99,20.32,0.06,0.93) 



TABLE III: Comparison of non-SNIa likelihood mod- 
els for Level II SNANA simulation: x^ values for the 
fits using various forms of the non-SNIa likelihood for the 
Level II simulations, where the true underlying model is 
unknown. The constraints on Q.rn,^A are consistent for all 
models of second order or higher. 



3. Level III: SDSS-II data 

In the case of the Level 111 SDSS-11 SN data, we shaU 
let the data inform us of the best choice of model for 
the non-SNIa distribution. In an observed sample of 
non-SNIa data, the theoretical distance modulus de- 



Model 


AXeff'^ 


Parameters 


r{z) = a'^z + a° 





2 


Parameters {a^,a'^) 


( 


-5.3,1.7) 


r{z) =a^z + a^z^ + a° 


-1.3 


3 


Parameters (a^, a^, a*"*) 


(-9.69,10.73,2.1) 


T(z) = a^z + a^z^ + a^z^ + a" 


-2.9 


4 


Parameters(a^, a^, a"*, a") 


(0.59,- 


-4.97,9.98,1.64) 


r{z) = {a^z + a^z^ a^)l{\ dz) 


-0.2 


4 


Parameters {a} ,c? ,a^ ,d) 


(-33.3,83.4,1.43,-19) 


° Difference in the effective x^ between 
linear case, which has x^g = 1215.3. 


a given 


model and the 



TABLE IV: Comparison of non-SNIa likelihood mod- 
els for Level III SDSS-II SN data: x^ values for the 
fits using various forms of the non-SNIa likelihood for the 
SDSS-III data. While the x^ decreases as the number of pa- 
rameters increases, it does not decrease significantly given 
the amount of freedom in the higher order models. The con- 
straints on Q.m,^K from these fits are shown in Figure [8] 
The parameter values are the mean of the one-dimensional 
likelihood for the model parameters. This form also ap- 
pears to be consistent with the SNANA simulated data (see 
Figure [3|. 



pends on their apparent magnitudes, which in turn de- 
pend on redshift and survey limiting magnitude, and 
the absolute magnitudes of the non-SNIa population, 
which are drawn from an unknown luminosity func- 
tion. In the large supernova limit we will learn about 
the distribution of those SNe that are not from the la 
distribution, however we treat them here as nuisance 
parameters that we marginalize over, rather than using 



a 'hard-coded' empirical relation. Table IV shows the 
various forms of the non-SNIa distribution considered, 
and the of the fit. 

The data seem consistent with a quadratic model, and 
the constraints do not change significantly for any as- 
sumed form, as shown in Figure |8] It is clear from the 
limited amount of data in the current SDSS-11 SN sam- 
ple that a complicated form is unjustified at present, 
however this will be tested as the amount of super- 
nova candidates increases with the SDSS-11 SN with 
host redshifts from BOSS (and future large photomet- 
ric surveys such as LSST and DES). 
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D. BEAMS posterior probabilities 

BEAMS uses probability information from photomet- 
ric la candidates in the hkehhood to determine cosmo- 
logical parameters. In this section, however, we illus- 
trate in addition that BEAMS can test whether a given 
set of probabilities are biased, and can allow for uncer- 
tainty in the probabilities themselves while computing 
cosmological constraints. 



1. Methodology 

The probability Pi is a prior probability from the ear- 
lier fitting and mapping process on whether or not 
the specific data belongs to the Type la population 
of supernovae. Using accurate probabilities within the 
BEAMS framework leads to the greatest reduction in 
the size of the parameter contours, while controlling for 
bias, as is shown in Figure [6j But one might ask, can 
BEAMS itself recover probability information from the 
data? The answer, as we will discuss below, is yes. In- 
deed, BEAMS posterior probabilities can be used as a 
check for bias in the input probabilities of the data. As 
described in Kunz et al. j47j . however, one can promote 
Pi to a free variable, and its posterior distribution then 
contains information on how well it fits the la or then 
non-la class of supernovae. If we leave just Pi for one 
i free, and fix all other parameters, then the posterior 
becomes 

p{e\^Ji) (X l^n^^ieiMjoj {P{iH\e,n = m 

+P{li,\9,T, = 0){1 - P,)} 



(17) 



where P(0|^j) is the posterior at the fixed parame- 
ter vector 6 containing both the cosmological parame- 
ters and the additional parameters describing the non- 
SNIa population (Eqs. (14) and (15)) for all super- 
novae except i. The above expression is just a straight 
line going from Jlj^i P{^\f^j)P{fM\d, Ti ~ 0) at the in- 
tercept Pi = to the value Jlj^i -^l^l/^jO^lMil^i — 
1) at Pi = 1. In general, we do not fix the parameters 
but sample from the full posterior, and then marginal- 
ize over everything except Pi. This results again in the 
posterior for Pi being a straight line. 
To extract the model probabilities corresponding to su- 
pernova i being of Type la or not, as opposed to the 
posterior distribution of the parameter Pi , we take re- 
course to the Savage-Dickey density ratio [Ml Ei] : In 



nested models the relative model probability (in favor 
of the more simple model) is the ratio of the posterior 
divided by the prior at the nested point. The two mod- 
els 7Wi = "Ia" and A^2="iiot la" are not nested, but we 
can use a trick by extending our model space with a 
third model A^3="Pi free". Then the two models are 
nested in that third model at the points P^ = 1 and 
Pi — respectively. Therefore the relative probabilities 



P^\^ = P(X3)/P(Xi) and 5^2^ = P{M3)/P{M2) for 
supernova i can be extracted from a MCMC chain with 
free probability P^, by looking at the end-points of the 
normalized posterior for P^, marginalized over all other 
parameters. Given the discussion above on the shape 
of the posterior of Pi, what we do in practice is to 
fit a straight line to the distribution of Pi values of a 
MCMC chain in which we left Pi free. The values at 
the end points give directly S31 and P32. The relative 
probability between models A4i and A^2 is now simply 

To which value should we set the probabilities that we 
keep fixed? A natural possibility would be to use the 
output of a prior typing stage, but this choice involves 
the risk that the prior probabilities could be biased. 
Instead we could use P — 1/2 to convey the minimal 
amount of extra information. In this case we should 
also use the A parameter to allow for an automatic 
correction of different total numbers of supernovae of 
different types. This choice has another advantage: as 
shown in Section IVA of [47] we get effectively P = 1/2 
if we marginalize over a fully free P, so this is also the 
choice where we let all Pj 's float freely and marginalize 
over all but Pi. For this reason we will use P = 1/2 
together with a free global A in the remainder of this 
section. 



(i) _ 



2. Toy-model illustration of posterior probabilities 

Let us illustrate the meaning of the posterior proba- 
bilities that we expect to find if BEAMS works with 
a simple toy model: we assume that we are dealing 
with two populations (let us call them 'red' and 'blue') 
drawn from two normal distributions with means at 
±9 and equal variances of cr^ = 1, see the top panel of 
Figure [9j 

We use this toy model specifically to highlight the most 
important elements in estimating the posterior proba- 
bilities, in the case where the populations are similar 
(e.g. they have equal variances) and to highlight the 
necessity of the normalization/rate parameter A. We 
will also see that unbiased probabilities imply that a 
small peak at low probability for the la or high proba- 
bility for the non-la is actually 'right' and is what we 
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should expect. 
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Figure 9: Posterior probabilities: the top panel provides 
an illustration of the two toy distributions, in the case of 
e = 0.5,1.0,2.0 (left to right). The bottom panel shows 
the probability histogram density plots, or number of red 
points with a given probability, where dN^^\P) is given in 
Eq. (l22| for e = 0.5 (blue), 1 (red) and 2 (yellow). 



The equality of the variances of the two populations 
means that we are measuring the distance A — 29 be- 
tween the two mean values in units of the standard de- 
viation. We also allow for different numbers of points 
drawn from the red and blue Gaussians through a 'rate 
parameter' p € [0, 1] that gives the probability to draw 
a red point. If we draw N points in total, we will 
then have on average pN red points and (1 — p)N blue 
points. The likelihood for a set of points {xj}, with j 



running from 1 to N, is then 



N 



(1 - P)e 



(18) 



for P = p. 

To simplify the analysis we assume that we are deal- 
ing with large samples so that is determined to high 
precision, with an error much smaller than cr. In this 
case (and since this is a toy model) we can take the 
parameter 6 fixed. We also note that if we are running 
this in BEAMS with a true prior probability P = p 
then we would find a normalization parameter A — 1^ 
while for P = 1/2 we would obtain A = p/{l — p), and 
we again assume that this parameter can be fixed to its 
true value. Then it is easy to see that if we leave the 
probability for point i, Pj, free, we find a Bayes factor 



B 



p{{x,}\e,p, = i) ^ e-^e^-^-)^ 
p{{x,}\e,p,^o) 



20X 



(19) 



In other words, \x\{B) = x^A, just the value of the data 
point times the separation of the means. If the point 
is exactly in between the two distributions Xi — Q then 
i? = 1, i.e. its BEAMS posterior probability to be red 
or blue is equal. Notice that the answer is independent 
of the value of p and this happens because we have 
removed any influence of the rate parameter A on the 
free Pi. This means that if we want to think of the 
BEAMS posterior probability as the probability to be 
red or blue, we should update the Bayes factor with 
A, i.e. use B = BA^ with an associated probability 
P = B/{l + B). We also see that the probability to be 
red increases exponentially as Xi increases. As we will 
see below, this reflects the fact that the number of red 
points relative to the blue points increases in the same 
way. The rapidity of this increase is governed by the 
separation. A, of the two distributions. 
What is the distribution of the posterior probabilities, 
i.e. the histogram of probability values, and what de- 
termines how well BEAMS does as a typer in this 
example? The number of red points in an interval 
[x, X + dx] is just given by the 'red' probability dis- 
tribution function at this value, times dx. To plot this 
function in terms of P we also need 



xiP) 

dP 

dx 



ln(B) ln(P/(l - P)) 



A 

APfl 



A 



(20) 
(21) 



The probability histograms for the red (r) and blue (b) 
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points, normalized to p and 1 — p respectively, then are 

dP 



dN'^^'HP) = , 



(22) 



X exp ■ 



1 (\n[P/{l - P)] 
A 



dN^''\P) = 



1 - p dP 
27rA P(l - P) 



(23) 



X exp ■ 



1 /^ln[F/(l - P)] 
A 




We plot dN^'^^/dP/p for 6* = 0.5, 1 and 2 in the lower 
panel of Figure |9] We see how the values become more 
concentrated around P = 1 for larger separation of the 
distributions, i.e. BEAMS becomes a "better" typer. 
But for very large separations there are also suddenly 
more supernovae at low P (yellow curve) . The reason is 
that BEAMS does not try to be the best possible typer, 
instead it respects the condition that the probabilities 
have to be unbiased, in the sense that 



P 



I-PJ \l-p 



P 



= BA = B. 



(24) 



Since BEAMS only uses the information coming from 
the distribution of the values, its power, as reflected in 
the distribution of probability values dN{P), is given 
by how strongly the distributions are separated. If they 
are identical {9 = 0) then BEAMS can only return 
P = 1/2 while for larger 6 there is a stronger pref- 
erence for one type over another. But given the two 
populations, we can in principle derive the probability 
histogram by just looking at the ratio of data points of 
either type at each point in data space, there is nothing 
else BEAMS can do. Also, in order for the probabil- 
ities to be unbiased (up to the rates which arc taken 
into account by A) if there are, say, 200 red points in 
the P = 0.9 bin and only 10 in the P = 0.8 bin, then 
we need to find about two blue points in the P — 0.8 
bin, but 20 in the P = 0.9 bin. Although this looks 
like a significant misclassification problem, it is just a 
reflection of Eq. ( 24 ) and is actually the desired behav- 
ior. 



3. Application to Level II 

In order to check whether BEAMS is able to pro- 
duce posterior probabilities with the expected prop- 
erties, we ran it on the Level II SNANA simulation 
for constant P = 1/2 prior probabilities, allowing for a 
free A. We plot the two probability histograms in the 



upper panel of Figure [TOj for probabilities that were 
updated with the posterior value of the rate parame- 
ter, A — 0.33. The cuts on the A parameter and re- 
moving all points with probability P^t < 0.01 reduce 
the number of non-SNela in the sample at low prob- 
ability, while there is quite a spread in the probabili- 
ties of the Type la supernovae. The right-hand panel 
shows the BEAMS posterior probabilities - the nor- 
malization parameter A allows BEAMS to adjust the 
high probability non-SNIa to low probability, and to 
sharpen the Type la probabilties around high proba- 
bility. The bottom panel of the figure shows the ra- 
tio of the two histograms in the upper panel to test 
whether we recover Eq. ( [24| . Given that B is the ratio 
of the number of la to non-SNIa points, we have that 
P = ln(P) — \n{Nig^) — ln(A'^,ionia)- Hence, the error 
bars on ln(P) are taken to be 



aiPf 



1 



1 



(PiV^^) ((l~P)A^i^)) 
le to' 

in the probability bin. 



(25) 



where N^^^ are the total actual number of supernovae 



4- Approximate methods 

The procedure to extract the posterior probabilities as 
outlined above is rather slow, as we need to run a full 
MCMC analysis for each supernova. This is only so 
because we evaluate the posterior for Pi given all other 
probabilities fixed to their mapped values. Leaving all 
the probabilities free would lead to a very high dimen- 
sional and complex posterior that would be very hard 
to sample from. However, since we are working at any 
rate in the limit of at most weak correlations between 
supernovae, we can leave free a subset n of the P; si- 
multaneously. It is better if n is much smaller than the 
total number of supernovae, and not too large in any 
case, for example n = 10. In addition, the more uncor- 
related the Pi, the easier it is to sample from the total 
posterior. But in this way we speed the process up by 
a factor of n, making it more tractable for large sets. 
Additionally, since the runs are independent, they can 
be performed on a large computer in parallel, so that 
even large supernova samples can be analyzed with the 
computational resources available to the typical astro- 
physics department (for example, we ran the Level II 
SNANA analysis above without this trick in just a day 
on the local cluster). 

A quicker method of determining the posterior prob- 
abilities is obtained by taking the ratio of the proba- 
bility that a data point, i is from the la type relative 
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Figure 10: Posterior probabilities within BEAMS for the Level II SNANA simulation: the top left panel 
shows the histogram of the MLCS2k2 fitter probabilities for the 35815 SNe in the SNANA simulation, while the top 
right-hand panel shows the histogram for the same data, where the probabilities are taken as the BEAMS-posterior 
estimated probabilities obtained using the A parameter. In both cases the probabilities are separated according to the true 
(input) type of the points, either la or non-la. There are few non-SNIa points in the quality controlled Level II SNANA 
simulation, but a reasonably small number of low-probability non-SNIa , compared to the BEAMS posterior probabilities 
on the same data. This is illustrated in the bottom panel, where we plot the ratio of SNIa to non-SNIa in probability bins 
for the MLCS2k2 probabilities (black crosses) and BEAMS posterior probabilities (grey dots) compared to the theoretical 
expectation ln(i3) — ln(P/(l — P)) (orange curve and error bars). For example, the dot for the 90% bin will lie on the 
theoretical curve if indeed 90% of the supernovae with Pja ~ 0.9 are of actually Type la, and the equivalent for the other 
bins. The fact that the histogram of BEAMS posterior probabilities (top right panel) shows more non-SNela are assigned 
low probability explains why there is less bias at low probability in the bottom panel. 



to the probability that is is a non-SNIa. If instead 
of marginalizing over all other parameters, we evalu- 
ate the posterior at the maximum likelihood point for 
the cosmological ■parameters^ where 6 — 6*, the ratio 
is simply given by 



P{^l^\6*,r, = 1) 
P(/x,|0*,T, = 0)- 



(26) 



We compare the approach to the computationally in- 
tensive one discussed above in Figure [TT] as a function 
of true SN type for the Level II SNANA simulation. 
In general the probabilities are consistent, especially 
in the case of the la SNe, with no SNIa being given a 
high probability in the full approach and being given 
a low probability using the ratio of maximum likeli- 
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Figure 11: Approximate and direct methods for ob- 
taining the posterior probabilities. There is a tight 
agreement between the probabilities obtained through full 
MCMC runs and the approximate approach taking the ra- 
tio of the la to the non-SNIa likelihoods at the maximum 
likelihood point (for the cosmological parameters). 



hoods. The mean and standard deviation of the dis- 
tribution of residuals between the two approaches are 
SPia = 0.005±0.015. As such, the approximate method 
provides a robust check of the full approach, as differ- 
ences in the probabilities are mostly related to conver- 
gence properties of the full estimation. As might be 
expected, non-SNela that are given a high probability 
using the full method are also given a high Pja using 
the approximate method. 



V. TESTS AND CHECKS FOR BIAS 

A. Dependence on error accuracy 

The full error on the distance modulus is given in 
Eq. (Ill - where the error combines measurement er- 



ror (from light-curve fitting) , intrinsic dispersion (from 
the absolute magnitude distribution of the SNe) and 
peculiar velocity error. In general the peculiar veloc- 
ity error is degenerate with the non-SNIa distribution 
characteristics in that the velocity error tends to in- 
crease the errors at low redshift. However, the in- 
trinsic dispersion of the non-SNIa effectively controls 
the spread in the distribution, which we know to in- 
crease at low redshift. Hence, fitting for the veloc- 
ity and intrinsic dispersion together can lead to one 
being unconstrained. We test for the dependence of 
the cosmological results on this effect in the Level I 



Gaussian simulation only, where the input simulated 
data model is completely understood. In the cosmo- 
logical analysis we set the peculiar velocity term to be 
set by Vpec — 300kms~^ [H], however doubling Vpec 
to 600kms^ does not change the inferred cosmology. 
When we allow Vpec to be free, we find it unconstrained 
by the data, with the minimum value saturating the 
lower bound of the prior of \og(vpec/c) = —20, and the 
maximum given by Vpi,^ < 1922kms~"'^. 



B. Dependence on la/non-Ia rates 

An additional complication to the probabilities is the 
dependence of the probabilities with redshift. This red- 
shift dependence occurs as a result of the fact that the 
signal-to-noise ratio changes as a function of redshift, 
and the effective rest-frame filters used to type SNe. 
The relative numbers of SNe at a given redshift depend 
on the various SN rates (or number of explosions per 
year per unit volume). In general, non-SNIa rates are 
less certain than Type la rates, since SNe are mainly 
followed up in a cosmological survey if they already 
appear to be good la candidates. As a test of this 
dependence, we modify the probabilities in two ways: 
firstly, we scale the true probabilities as a function of 
redshift as 



Pja,, = min ( Pia , 1 ) , (27) 



which increases the probability of being la of data at 
higher redshift. In this case the fact that there is no 
redshift dependence in A itself introduces a slight bias 
in the inferred cosmology, as is shown in Figure |12[ 
with the input cosmology only recovered at 2cr (the 
filled 1- and 2a contours from the linear unbiased case 
are shown for comparison). An alternative way of prob- 
ing this dependence is by artificially changing the rela- 
tive numbers of la to non-SNIa SNe in a given simula- 
tion. We do this by simply removing a subset of the la 
data (where the data are binned in ten redshift bins) 
after assigning probabilities to ensure that we are ef- 
fectively biasing the probabilities - or rather, that the 
probability of being a Type la at a given redshift will 
not reflect how many la actually exist at that redshift. 



This case is also shown in Figure 12 for the Level I 
Gaussian simulation and the Level II SNANA simula- 
tion, where the contours are slightly larger than the 
standard case (given that data are removed), but are 
consistent with the input cosmology at Ict. 
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Figure 12: The dependence of the probabilities on 
SN rate: the 2cr contours in the Qrn,^A plane when con- 
sidering two different methods of introducing a redshift de- 
pendence on the probabilities for the Level I Gaussian sim- 
ulation (top panel) and the Level II SNANA simulation 
(bottom panel). The brown contours show the case where 
the Pia probabilities were explicitly changed to depend on 
redshift through Eq. (27 1. The black curves illustrate the 



case where the overall probabilities are left unchanged, but 
the number of la SNe relative to the non-SNela is changed 
as a function of redshift. 



C. Dependence on Probability 

The BEAMS algorithm naturally uses some indica- 
tion of the probability of a data point to belong to 
the la population, whether it is some measure of the 



goodness-of-fit of the data to a Type la light-curve 
template, or something more robust such as the rel- 
ative probability that the point is a la compared to 
the probability of being of a different type. By includ- 
ing a normalization factor, we can correct for general 
biases in the probabilities of the la points. One might 
still question, however, how sensitive BEAMS is to the 
input probability of the objects. For the Level I Gaus- 
sian simulation, where we assign the probabilities, Pja, 
directly we can change the relationship between the 
true underlying distribution of the types (i.e. the ratio 
of las to non-las in the sample) and the input prob- 
ability value (the number we input into the BEAMS 
algorithm as the Pia). If the probabilities are unbiased 
then the distribution of types should follow the prob- 
ability distribution of the data, in other words 60% 
of the points with Pja = 0.6 should be Type la SNe. 
This is the standard case. We then modify the prob- 
abilities by assigning a probability of Pja = 0.3 to all 
points (which we know will be biased since the mean 
probability of the sample is 0.667). 
We compare the constraints in the two cases in Fig- 
ure |13| If we ignore all probability information and 
set it to a (biased) value of Pja = 0.3, the probability 
information is essentially controlled by the normaliza- 
tion parameter. A tends to a value of 4.7, which, when 
inserted into Equation (|4| yields a 'normalized' prob- 
abihty of Pja 0.668. Hence BEAMS uses the nor- 
malization parameter to remap the mean of the given 
probabilities to ones that have a mean that fits the 
true unbiased probabilities. In correcting for this ef- 
fect, BEAMS manages to recover cosmological param- 
eters consistent with the unbiased case. 
For the Level II SNANA simulation, the true underly- 
ing probability distribution is more complicated, and 
so we test for dependence on probability in a different 
way. The SNANA simulation mimics real-life obser- 
vations in that it treats the simulated light-curves as 
'real' data and fits them in the same way one would 
fit and analyze current data. The main bias from this 
dataset will be any bias introduced in the probabili- 
ties of the data to be of Type la, since we have no 
guarantee a priori that the probabilities will be unbi- 
ased. We thus fit for the cosmology assuming different 
proxies for the probability, either taking the probabil- 
ity from the light-curve fitter alone, Pat or setting the 
probabilities to an arbitrary value of P = 1/2. In the 
Level III SDSS-II SN data, we add in a case where 
an additional, typer probability Ptyper 

[SZllSH] is used, 

which computes the relative goodness-of-fit of differ- 
ent la and non-SNIa templates to the data. In the 
case of the Level III SDSS-II SN data, the average 
probability obtained using the PSNID fitting proce- 
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Figure 13: BEAMS corrects for biased input prob- 
ability: the marginalized one-dimensional likelihood for 
the normalization parameter A (top panel) and estimated 
contours (bottom panel) for Level I Gaussian simulation 
under two forms of the probability distribution. The pink 
curve and contours correspond to the nominal case, where 
the probabilities are generated in a linear model, and the 
types are assigned according to the probabilities. The pur- 
ple dashed contours correspond to assigning a probability 
of Pia — 0.3 to all points. The dashed vertical lines show 
the expected value of the parameter A such that the true 
input mean probability of Pia = 0.667 is recovered. Note 
that the a;-axis in the top panel has been shortened to allow 
for comparison of the two distributions. 



Figure 14: Using different probabilities for the Level 
II SNANA simulations and the Level III SDSS-II 
SN data: Top: the blue filled 1,2a contours show the 
constraints when setting the probabilities of all points to 
P — 0.5, while the orange dashed curves show the 2a 
contours when using the goodness-of-fit probabilities from 
the MLCS2k2 fitter for the Level II SNANA simulation. 
Bottom: the solid 1,2a blue contours are from ignoring 
probability information, and setting all the probabilities 
of the points to = 0.5 for the Level III SDSS-II SN 
data. The light curves {2a constraints) result when us- 
ing the MLCS2k2 goodness-of-fit probability, which is un- 
normalized relative to the other types, and is typically low 
for the sample. The dark purple contours are from using 
the probabilities for each point from the PSNID prescrip- 
tion Sako et al. |37II38| . also at 2a. In both cases, the effect 
is a shift of < la in the inferred cosmological contours. 
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dure is (Ptyper) = 0.79. In this case the normahzation 
parameter A peaked around 0.3, resulting in a nor- 
mahzed average probabihty of (Plfl) = 0.525. In the 
case where the MLCS2k2 goodness-of-fit probabilities 
are used, the average probability of being a la is lower, 
(Pfit) = 0.41. In this case the A parameter is dis- 
tributed around A = 25, leading to {Pj^}) = 0.95 - 
BEAMS tries to increase the average probability of all 
points to be close to unity. Finally when setting the 
probability to 0.5, A is centered around 2.5, leading to 
i^iai) — 0-71. Typing SNe effectively is an active area 
of research [30H35l [37l [38l l40l [66] . indeed, a recent com- 
munity wide challenge provided a way of testing the 
ability of various approaches to type SN efficiently (see 
Kessler et al. and references therein). With more 
data and improved algorithms, the probabilities used in 
photometric SN analysis will greatly improve. As Fig- 
ure [14] illustrates, however, BEAMS can use the min- 
imal amount of probability information, and recover 
consistent results. 

VI. CONCLUSIONS AND OUTLOOK 



Bayesian Estimation Applied to Multiple Species 
(BEAMS) is a statistically robust method of param- 
eter estimation in the presence of contamination. The 
key power of BEAMS is in the fact that it makes use 
of all available data, hence reducing the statistical er- 
ror of the measurement, whether or not the purity of 
the sample can be guaranteed. Rather than discard- 
ing data, the probability that the data are "pure" is 
used as a weight in the full Bayesian posterior, reduc- 
ing potential bias from the interloper distribution. We 
summarize the paper as follows: 

• We tested the BEAMS algorithm on an ideal 
Gaussian simulation of 37529 SNe, consisting of 
one population of non-SNela and one SNIa pop- 
ulation. We showed that the area of the con- 
tours in the fi™, JIa plane when using BEAMS is 
six times smaller than when using only a small 
spectroscopic subsample of the data. In addition, 
we showed that the size of the error ellipse using 
BEAMS decreases as Eq. (lie]). 



• We tested the BEAMS algorithm on a more re- 
alistic simulated sample of 35815 SNe obtained 
from light-curve fitting 58 , which includes many 
more than two populations of non-SNIa , and dis- 
cussed the validity of the single non-SNIa pop- 
ulation assumed in this version of the BEAMS 
algorithm. A simple two parameter model fails 
to completely describe the distribution - how- 



ever the constraints using BEAMS are not sig- 
nificantly biased. 

• We applied BEAMS to the SDSS-II SN dataset 
of 792 SNe, using photometric data points with 
host galaxy spectroscopic redshifts, and showed 
that the BEAMS contours are three times smaller 
than when using only the spectroscopically con- 
firmed sample of 297 SNe la. 

• In both the 'realistic' and Gaussian simulations, 
we assume a variety of models for the distance 
modulus distribution of the non-SNIa popula- 
tion, and test for a dependence of the inferred 
cosmology on the assumed form of the distance 
modulus function. BEAMS requires a model 
with enough freedom to capture the behavior 
as a function of redshift: functions of quadratic 
and higher order are required to fit the Level II 
SNANA simulations, while no strong preference 
is seen for any particular model using the SDSS- 
II SN sample. 

• We investigated possible biases introduced 
through incorrect probability or rate information, 
or error accuracy, showing that BEAMS can cor- 
rect for the biases when suitable nuisance param- 
eters were marginalised over. 

• We discussed the ability of BEAMS to determine 
the posterior probability of a point based on its fit 
to the best-fit model. Posterior probabilities es- 
timated through BEAMS more accurately model 
the relative probabilities of the SN types. 

As mentioned above, we have restricted ourselves 
to the binomial case of a SNIa population and one 
general core-collapse, or non-SNIa, population. While 
this assumption is valid for the SDSS-II SN data, 
we see that a more complicated model with at least 
two separate non-SNIa Gaussians is more appropriate 
for the Level II SNANA simulations (see Figure |4|. 
This remains to be confirmed with large photometric 
datasets such as BOSS. The BEAMS method can 
easily be extended to the multinomial case, as we 
learn more about the distributions of the contaminant 
populations - this will be performed in the upcoming 
BEAMS analysis of the SDSS supernovae with host 
redshifts obtained through the BOSS survey. We 
list some general lessons for photometric supernova 
cosmology with BEAMS in Appendix [B] 

In addition to the binomial approximation for the 
likelihoods, we have also assumed that host galaxy 
redshifts are known for all SNe. One can include 
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photometric redshift error by looping N times, once 
per supernova and marginalizing over the redshift 
of the z-th SN in each case (in a similar fashion as 
the method of obtaining the post facto estimation 
of the la probability of each SN in Section IV D ) . 



The sensitivity to redshift error can also be included 
directly through fitting for cosmology directly in 
'light-curve' space (see for example March et al. [671 )• 
However, we leave this general treatment of redshift 
uncertainty in BEAMS to future work. 

In the specific case of the SDSS-II Supernova Survey, 
we will apply BEAMS to the larger SN sample with 
spectroscopic host galaxy redshifts from the BOSS 
survey. Not only will the sample size of data points 
with accurate host redshifts increase (thereby further 
reducing the cosmological contours from photometric 
data) but the larger sample of non-SNela will allow 
one to easily calibrate the constraints obtained with 
the current application of BEAMS against the case 
where one only has photometric redshifts, which will 
be the case for LSST gT]. 

With the wealth of new photometric data awaiting SN 
cosmology, BEAMS provides a platform to learn more 
about the SN populations while at the same time tack- 
ling the fundamental questions about the constituents 
of the universe. 
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Fit probability 

Figure 15: Distance modulus-probability correla- 
tions in the Level II SNANA simulations: while a cor- 
relation exists between the Type la probability (determined 
from the fit to the light curve models within SNANA) 
and supernova type - 'true' las have a higher probability 
of being la - there is no significant correlation between the 
residual distance modulus of the SNANA simulation and 
the Pia probability. 



Appendix A: Probability correlations 



As we highlighted in Section |IVD[ the normalization 
parameter A is crucial to normalize the probabilities 
in order to avoid biases introduced by the fitting or 
typing procedure. Moreover, it provides a mechanism 
for removing the strong dependence on probability di- 
rectly, by allowing BEAMS to adjust the probabilities 
according to the global fit to a distance modulus func- 
tion. In Figure [15] we show that there are no strong 
correlations between the la probability determined 
from the SNANA fits to the light curve models and 
the difference between the input cosmology and the 
inferred cosmological model. In general there is more 
spread in the non-SNIa population, however there are 
some non-SNIa points with high probability, and there 



are low-probability la data. We leave the investigation 
of different forms for A to future work. 

Appendix B: BEAMS troubleshooting 

Analysis of purely photometric data brings its own 
challenges to the fore. We briefly highlight some con- 
siderations when applying the algorithm to such data. 

• In order for BEAMS to recover the correct 
cosmology, it requires the freedom to capture the 
characteristics of the non-SNIa (and indeed the 
SNIa) distributions. In particular, we found that 
the error analysis has a significant impact on 
the inferred cosmology, both within BEAMS but 
equally for a basic approach. Our addition 
of two different intrinsic dispersion terms for 
the la and non-SNIa populations effectively 
change the relative weighting of the populations 
in a consistent manner, while still taking into 
account the measurement error on each point, 
which may or may not be a function of type. 



When applying fitting procedures such as 
MLCS2k2 to the dataset, efficiency maps (to 
account for Malmquist bias, for example) should 
be carefully calibrated not to introduce redshift 
dependent biases to the dataset. Alternatively, 
the BEAMS likelihood can be adjusted from a 
standard Gaussian to a truncated or deformed 
Gaussian distribution to account for the selection 
bias in the survey. We leave this investigation 
for future work. 



• As the amount of observations of the contami- 
nants increases, new forms of the non-SNIa dis- 
tance modulus function may be more strongly 
motivated by the data. While we have tested 
various forms for simulated SNANA data and for 
the SDSS-II survey data, these functions should 
be varied to allow the model enough freedom to 
capture the deviations from the standard la dis- 
tance modulus relation. In addition, future data 
may motivate for multiple populations, a feature 
which is easily included in BEAMS. 



